Search for: All records

Creators/Authors contains: "Wu, Mingxuan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Humans can learn to manipulate new objects by simply watching others; providing robots with the ability to learn from such demonstrations would enable a natural interface for specifying new behaviors. This work develops Robot See Robot Do (RSRD), a method for imitating articulated object manipulation from a single monocular RGB human demonstration given a single static multi-view object scan. We first propose 4D Differentiable Part Models (4D-DPM), a method for recovering 3D part motion from a monocular video with differentiable rendering. This analysis-by-synthesis approach uses part-centric feature fields in an iterative optimization, which enables the use of geometric regularizers to recover 3D motions from only a single video. Given this 4D reconstruction, the robot replicates object trajectories by planning bimanual arm motions that induce the demonstrated object part motion. By representing demonstrations as part-centric trajectories, RSRD focuses on replicating the demonstration's intended behavior while considering the robot's own morphological limits, rather than attempting to reproduce the hand's motion. We evaluate 4D-DPM's 3D tracking accuracy on ground truth annotated 3D part trajectories and RSRD's physical execution performance on 9 objects across 10 trials each on a bimanual YuMi robot. Each phase of RSRD achieves an average success rate of 87%, for a total end-to-end success rate of 60% across 90 trials. Notably, this is accomplished using only feature fields distilled from large pretrained vision models -- without any task-specific training, fine-tuning, dataset collection, or annotation. 
  2. Grouping is inherently ambiguous due to the multiple levels of granularity in which one can decompose a scene -- should the wheels of an excavator be considered separate or part of the whole? We present Group Anything with Radiance Fields (GARField), an approach for decomposing 3D scenes into a hierarchy of semantically meaningful groups from posed image inputs. To do this we embrace group ambiguity through physical scale: by optimizing a scale-conditioned 3D affinity feature field, a point in the world can belong to different groups of different sizes. We optimize this field from a set of 2D masks provided by Segment Anything (SAM) in a way that respects coarse-to-fine hierarchy, using scale to consistently fuse conflicting masks from different viewpoints. From this field we can derive a hierarchy of possible groupings via automatic tree construction or user interaction. We evaluate GARField on a variety of in-the-wild scenes and find it effectively extracts groups at many levels: clusters of objects, objects, and various subparts. GARField inherently represents multi-view consistent groupings and produces higher fidelity groups than the input SAM masks. GARField's hierarchical grouping could have exciting downstream applications such as 3D asset extraction or dynamic scene understanding. See the project website at https://www.garfield.studio/ 
  3. Atmospheric chemistry plays a crucial role in Earth system models (ESMs), controlling atmospheric composition and radiative balance; it is highly interactive with the physical climate, biogeochemical cycles, and human systems. However, it often imposes computational challenges in an ESM. Here we develop a full troposphere‐stratosphere interactive chemistry module for the US Department of Energy's Energy Exascale Earth System Model (E3SM). We intentionally build a streamlined module based on E3SM version 2 that interacts with other components and maintains all of the major chemical and chemistry‐climate feedbacks. The module incorporates a new, highly efficient tracer advection scheme; linearization of stratospheric chemistry; and an abridged tropospheric chemical mechanism with 28 reactive tracers. This new model, E3SM‐chem, can readily perform century‐long climate simulations of ozone, methane, and nitrous oxide based on emission scenarios as well as provide hourly budgets for the gas‐phase radicals that drive aerosol chemistry. We evaluate E3SM‐chem with an atmosphere‐only simulation as in the recent climate model intercomparison project (CMIP6), finding results similar to the other CMIP6 models. For the present day, E3SM‐chem matches the standard measurement metrics for stratospheric and tropospheric ozone, surface air quality, other key reactive gases like carbon monoxide, and the methane lifetime. Overall, E3SM‐chem maintains the climate fidelity of the baseline model while adding at most 20% to the computational cost of the atmosphere model. Hence, interactive chemistry can be a default configuration for long climate simulations at resolutions of 1° or finer, which is crucial for producing self‐consistent chemistry‐climate feedbacks that alter the climate system. 
    Free, publicly-accessible full text available October 1, 2026
  4. This paper describes the atmospheric component of the US Department of Energy's Energy Exascale Earth System Model (E3SM) version 3. Significant updates have been made to the atmospheric physics compared to earlier versions. Specifically, interactive gas chemistry has been implemented, along with improved representations of aerosols and dust emissions. A new stratiform cloud microphysics scheme more physically treats ice processes and aerosol‐cloud interactions. The deep convection parameterization has been largely improved with sophisticated microphysics for convective clouds, making model convection sensitive to large‐scale dynamics, and incorporating the dynamical and physical effects of organized mesoscale convection. Improvements in aerosol wet removal processes and parameter re‐tuning of key aerosol and cloud processes have improved model aerosol radiative forcing. The model's vertical resolution has increased from 72 to 80 layers with the extra eight layers added in the lower stratosphere to better simulate the Quasi‐Biennial Oscillation. These improvements have enhanced E3SM's capability to couple aerosol, chemistry, and biogeochemistry and reduced some long‐standing biases in simulating tropical variability. Compared to its predecessors, the model shows a much stronger signal for the Madden‐Julian Oscillation, Kelvin waves, mixed Rossby‐gravity waves, and eastward inertia‐gravity waves. Aerosol radiative forcing has been considerably reduced and is now better aligned with community best estimates, leading to significantly improved skill in simulating historical temperature records. Its simulated mean‐state climate is largely comparable to E3SMv2, but with some notable degradation in shortwave cloud radiative effect, precipitable water, and surface wind stress, which will be addressed in future updates. 
    Free, publicly-accessible full text available October 1, 2026